{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lesson 4: Image Generation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p style=\"background-color:#fff6e4; padding:15px; border-width:3px; border-color:#f5ecda; border-style:solid; border-radius:6px\"> ⏳ <b>Note <code>(Kernel Starting)</code>:</b> This notebook takes about 30 seconds to be ready to use. You may start and watch the video while you wait.</p>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* In this classroom, the libraries have already been installed for you.\n",
    "* If you would like to run this code on your own machine, you need to install the following:\n",
    "    ```\n",
    "    !pip install -q accelerate torch diffusers transformers comet_ml matplotlib\n",
    "    ```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set up Comet"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "import comet_ml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46
   },
   "outputs": [],
   "source": [
    "comet_ml.init(anonymous=True, project_name=\"4: Diffusion Prompting\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 97
   },
   "outputs": [],
   "source": [
    "# Create the Comet Experiment for logging\n",
    "exp = comet_ml.Experiment()\n",
    "\n",
    "logged_artifact = exp.get_artifact(\"L4-data\", \"anmorgan24\")\n",
    "local_artifact = logged_artifact.download(\"./\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load images"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "from PIL import Image"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 63
   },
   "outputs": [],
   "source": [
    "# Load the source image and its binary mask, both resized to 256x256\n",
    "image = Image.open(\"L4_data/boy-with-kitten.jpg\").resize((256, 256))\n",
    "image_mask = Image.open(\"L4_data/cat_binary_mask.png\").resize((256, 256))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Note: For your convenience, the images referenced in this notebook have already been uploaded to the Jupyter directory in this classroom. For further details, please refer to the **Appendix** section at the end of the lessons."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46
   },
   "outputs": [],
   "source": [
    "# Display the image\n",
    "plt.imshow(image)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 46
   },
   "outputs": [],
   "source": [
    "# Display the mask\n",
    "plt.imshow(image_mask)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Import and prepare the model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Import [torch](https://pytorch.org/)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 63
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Initialize the Stable Diffusion inpainting pipeline.\n",
    "  * Note: to learn more about the `float16` and `bfloat16` data types and when to use one or the other, see the lesson [\"Data Types and Sizes\"](https://learn.deeplearning.ai/courses/quantization-fundamentals/lesson/3/data-types-and-sizes) in the short course \"Quantization Fundamentals\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 131
   },
   "outputs": [],
   "source": [
    "from diffusers import StableDiffusionInpaintPipeline\n",
    "\n",
    "# Load the inpainting weights: float16 on GPU, bfloat16 on CPU\n",
    "sd_pipe = StableDiffusionInpaintPipeline.from_pretrained(\n",
    "    \"./models/stabilityai/stable-diffusion-2-inpainting\",\n",
    "    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.bfloat16,\n",
    "    low_cpu_mem_usage=not torch.cuda.is_available(),\n",
    ").to(device)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 80
   },
   "outputs": [],
   "source": [
    "# Set the value of seed manually for reproducibility of the results\n",
    "seed = 66733\n",
    "generator = torch.Generator(device).manual_seed(seed)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "prompt = \"a realistic green dragon\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Note\n",
    "- From this point on, the code that generates the images takes too long to run in the classroom environment.\n",
    "  - So the code is left as markdown in the classroom.\n",
    "- In the classroom, and regardless of whether you have access to GPUs, you can still run the code that retrieves these image generation results using the experiment tracking tool.\n",
    "> - Thank you for your understanding as we try to make these courses free and accessible to as many people as possible. 💕 💫\n",
    "\n",
    "- **Hardware requirements:** \n",
    "To generate the images in this lab on a CPU, a machine with at least 8 GB of RAM should suffice. Expect roughly 3 minutes for 3 inference steps and about 10 minutes for 10 steps; 100 steps will take considerably longer.\n",
    "With a GPU, either a local one or one from a platform such as [Colab GPU](https://colab.research.google.com/), all 3 inference steps can complete in under 1 second."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Define a new **Comet** experiment.\n",
    "- The following image generation code can take hours without a GPU.\n",
    "- Its results are saved with an experiment tracking tool (Comet), so that you can retrieve them in this classroom environment (and on any computer, regardless of GPU access)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "exp = comet_ml.Experiment()\n",
    "\n",
    "output = sd_pipe(\n",
    "  image=image,\n",
    "  mask_image=image_mask,\n",
    "  prompt=prompt,\n",
    "  generator=generator,\n",
    "  num_inference_steps=3,\n",
    ")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "```Python\n",
    "generated_image = output.images[0]\n",
    "\n",
    "exp.log_image(\n",
    "    generated_image,\n",
    "    name=f\"{prompt}\",\n",
    "    metadata={\n",
    "        \"prompt\": prompt,\n",
    "        \"seed\": seed,\n",
    "        \"num_inference_steps\": 3\n",
    "    }\n",
    ")\n",
    "\n",
    "exp.end()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Retrieve the experiment results\n",
    "- Regardless of the environment that you are running in, you can retrieve the results of the experiment using the experiment tracking tool (Comet)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 182
   },
   "outputs": [],
   "source": [
    "import io\n",
    "\n",
    "reference_experiment = comet_ml.APIExperiment(\n",
    "    workspace=\"ckaiser\",\n",
    "    project_name=\"4-diffusion-prompting\",\n",
    "    previous_experiment=\"b1b9e80bb0054b52a8914beec97d36a6\"\n",
    ")\n",
    "\n",
    "reference_image = reference_experiment.get_asset_by_name(f\"{prompt}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Display the reference image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "plt.imshow(Image.open(io.BytesIO(reference_image)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Note:\n",
    "- We'll now explore different hyperparameters.\n",
    "- As before, running the image generation code would take hours in the classroom environment (or in any environment without GPUs)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Try a different number of inference steps (`num_inference_steps`)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "```Python\n",
    "exp = comet_ml.Experiment()\n",
    "\n",
    "prompt = \"a realistic green dragon\"\n",
    "\n",
    "exp.log_parameters({\n",
    "    \"seed\": seed,\n",
    "    \"num_inference_steps\": 100\n",
    "})\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "output = sd_pipe(\n",
    "  image=image,\n",
    "  mask_image=image_mask,\n",
    "  prompt=prompt,\n",
    "  generator=generator,\n",
    "  num_inference_steps=100,\n",
    ")\n",
    "\n",
    "generated_image = output.images[0]\n",
    "\n",
    "exp.log_image(\n",
    "    generated_image,\n",
    "    name=f\"{prompt}\",\n",
    "    metadata={\n",
    "        \"prompt\": prompt,\n",
    "        \"seed\": seed,\n",
    "        \"num_inference_steps\": 100\n",
    "    }\n",
    ")\n",
    "\n",
    "exp.end()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Retrieve the experiment results\n",
    "- In the classroom or in any environment, you can retrieve the results of the image generation run by accessing the logs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 148
   },
   "outputs": [],
   "source": [
    "reference_experiment = comet_ml.APIExperiment(\n",
    "    workspace=\"ckaiser\",\n",
    "    project_name=\"4-diffusion-prompting\",\n",
    "    previous_experiment=\"948c8e6cfd23420c86a0de5f65719955\"\n",
    ")\n",
    "\n",
    "reference_image = reference_experiment.get_asset_by_name(f\"{prompt}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 29
   },
   "outputs": [],
   "source": [
    "plt.imshow(Image.open(io.BytesIO(reference_image)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Set up different `guidance_scale` values.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- This code is best run on a GPU, so it's left as markdown in the classroom."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "import numpy as np\n",
    "\n",
    "# Sweep guidance_scale over 0, 10 and 20 (the stop value 21 is excluded)\n",
    "guidance_scale_values = np.arange(0, 21, 10).tolist()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "exp = comet_ml.Experiment()\n",
    "\n",
    "prompt = \"a realistic green dragon\"\n",
    "\n",
    "num_inference_steps = 100  # consider a smaller value (e.g. 10) when running on CPU\n",
    "\n",
    "exp.log_parameters({\n",
    "    \"seed\": seed,\n",
    "})\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Pass `guidance_scale` to the pipeline."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "for guidance_scale in guidance_scale_values:\n",
    "\n",
    "    output = sd_pipe(\n",
    "      image=image,\n",
    "      mask_image=image_mask,\n",
    "      prompt=prompt,\n",
    "      generator=generator,\n",
    "      num_inference_steps=num_inference_steps,\n",
    "      guidance_scale=guidance_scale\n",
    "    )\n",
    "\n",
    "    generated_image = output.images[0]\n",
    "\n",
    "    exp.log_image(\n",
    "        generated_image,\n",
    "        name=f\"{prompt}\",\n",
    "        metadata={\n",
    "            \"prompt\": prompt,\n",
    "            \"seed\": seed,\n",
    "            \"num_inference_steps\": num_inference_steps,\n",
    "            \"guidance_scale\": guidance_scale\n",
    "        }\n",
    "    )\n",
    "\n",
    "exp.end()\n",
    "```"
   ]
  },
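  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- The effect of `guidance_scale` in the loop above can be illustrated with a minimal NumPy sketch of classifier-free guidance. This is an illustration under stated assumptions, not the pipeline's actual code: `apply_guidance` is a hypothetical helper, and the two toy arrays stand in for the noise predictions the model makes without and with the prompt.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def apply_guidance(noise_uncond, noise_text, guidance_scale):\n",
    "    # Hypothetical helper: start from the unconditional prediction and move\n",
    "    # toward the prompt-conditioned one, scaled by guidance_scale.\n",
    "    return noise_uncond + guidance_scale * (noise_text - noise_uncond)\n",
    "\n",
    "# Toy stand-ins for the model's two noise predictions (assumed values)\n",
    "noise_uncond = np.array([0.0, 1.0])\n",
    "noise_text = np.array([1.0, 3.0])\n",
    "\n",
    "for scale in [0, 1, 10]:\n",
    "    print(scale, apply_guidance(noise_uncond, noise_text, scale))\n",
    "```\n",
    "\n",
    "- In this sketch, a scale of 1 returns the prompt-conditioned prediction unchanged, and larger values push the result further in the direction of the prompt. As far as we understand, diffusers only applies this formula when `guidance_scale` is greater than 1, so very small values may behave differently in the real pipeline."
   ]
  },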
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Retrieve the experiment results\n",
    "- As before, regardless of whether you have access to GPUs or not, you can retrieve the results of the image generation code from the logs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 131
   },
   "outputs": [],
   "source": [
    "reference_experiment = comet_ml.APIExperiment(\n",
    "    workspace=\"ckaiser\",\n",
    "    project_name=\"4-diffusion-prompting\",\n",
    "    previous_experiment=\"b34b94f94c594802b7090b6f2f1224f2\"\n",
    ")\n",
    "\n",
    "reference_experiment.display(tab=\"images\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Set up another hyperparameter: `strength`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Add the `strength` hyperparameter.\n",
    "- This code is best run on a GPU, so it's left as markdown in the classroom."
   ]
  },
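  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "# Sweep strength over 0.1, 0.3, 0.5, 0.7 and 0.9; rounding removes floating-point noise\n",
    "strength_values = np.round(np.arange(0.1, 1.1, 0.2), 1).tolist()\n",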
    "```"
   ]
  },
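  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- To see which values the `np.arange` sweeps in this lesson produce, here is a small standalone sketch. `np.arange` excludes the stop value, and with a fractional step the results can carry floating-point noise, so rounding before logging is one option for keeping the metadata tidy. The name `raw_strengths` is introduced here only for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Guidance scale sweep: start 0, stop 21 (excluded), step 10\n",
    "guidance_scale_values = np.arange(0, 21, 10).tolist()\n",
    "print(guidance_scale_values)\n",
    "\n",
    "# Strength sweep: start 0.1, stop 1.1 (excluded), step 0.2\n",
    "raw_strengths = np.arange(0.1, 1.1, 0.2)\n",
    "print(raw_strengths.tolist())  # may show floating-point noise\n",
    "\n",
    "# Rounding to one decimal gives clean values for logging\n",
    "strength_values = np.round(raw_strengths, 1).tolist()\n",
    "print(strength_values)\n",
    "```\n",
    "\n",
    "- Calling `.tolist()` also converts NumPy scalars to plain Python numbers, which tend to be easier to log and compare."
   ]
  },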
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "exp = comet_ml.Experiment()\n",
    "\n",
    "prompt = \"a realistic green dragon\"\n",
    "\n",
    "num_inference_steps = 200 if torch.cuda.is_available() else 10\n",
    "\n",
    "exp.log_parameters({\n",
    "    \"seed\": seed,\n",
    "})\n",
    "\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "for strength in strength_values:\n",
    "\n",
    "    output = sd_pipe(\n",
    "      image=image,\n",
    "      mask_image=image_mask,\n",
    "      prompt=prompt,\n",
    "      generator=generator,\n",
    "      num_inference_steps=num_inference_steps,\n",
    "      strength=strength\n",
    "    )\n",
    "\n",
    "    generated_image = output.images[0]\n",
    "\n",
    "    exp.log_image(\n",
    "        generated_image,\n",
    "        name=f\"{prompt}\",\n",
    "        metadata={\n",
    "            \"prompt\": prompt,\n",
    "            \"seed\": seed,\n",
    "            \"num_inference_steps\": num_inference_steps,\n",
    "            \"strength\": strength\n",
    "        }\n",
    "    )\n",
    "\n",
    "exp.end()\n",
    "```"
   ]
  },
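  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- One way to build intuition for `strength`: besides setting how much noise is added to the masked region, it also limits how many of the requested denoising steps actually run. The sketch below mirrors, as far as we understand it, how diffusers derives that count. `effective_steps` is a hypothetical helper written for illustration, and the exact behavior may differ between library versions.\n",
    "\n",
    "```python\n",
    "def effective_steps(num_inference_steps, strength):\n",
    "    # Hypothetical helper: number of denoising steps that actually run,\n",
    "    # assuming the product is truncated to an integer and capped at the\n",
    "    # requested number of steps.\n",
    "    return min(int(num_inference_steps * strength), num_inference_steps)\n",
    "\n",
    "# Rounded strength values from the sweep, with the 200 steps used on GPU\n",
    "for strength in [0.1, 0.3, 0.5, 0.7, 0.9]:\n",
    "    print(strength, effective_steps(200, strength))\n",
    "```\n",
    "\n",
    "- Under this assumption, a low strength runs only a few steps, so the masked region stays close to the original image, while a strength of 1.0 runs every requested step."
   ]
  },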
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Retrieve the experiment results\n",
    "- With experiment tracking, you can compare the most recent run with the earlier ones."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 131
   },
   "outputs": [],
   "source": [
    "reference_experiment = comet_ml.APIExperiment(\n",
    "    workspace=\"ckaiser\",\n",
    "    project_name=\"4-diffusion-prompting\",\n",
    "    previous_experiment=\"2964615a382d46f09c3a36c50c74deef\"\n",
    ")\n",
    "\n",
    "reference_experiment.display(tab=\"images\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Try adding a Negative Prompt.\n",
    "- Setting the negative prompt to \"cartoon\" asks the image generation model to avoid generating an image that looks like a cartoon.\n",
    "- Again, the image generation code is best run on a GPU."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "exp = comet_ml.Experiment()\n",
    "\n",
    "prompt = \"a realistic green dragon\"\n",
    "negative_prompt = \"cartoon\"\n",
    "\n",
    "num_inference_steps = 100 if torch.cuda.is_available() else 10\n",
    "\n",
    "exp.log_parameters({\n",
    "    \"seed\": seed,\n",
    "})\n",
    "\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```Python\n",
    "output = sd_pipe(\n",
    "  image=image,\n",
    "  mask_image=image_mask,\n",
    "  prompt=prompt,\n",
    "  negative_prompt=negative_prompt,\n",
    "  generator=generator,\n",
    "  num_inference_steps=num_inference_steps,\n",
    "  guidance_scale=10\n",
    ")\n",
    "\n",
    "generated_image = output.images[0]\n",
    "\n",
    "exp.log_image(\n",
    "    generated_image,\n",
    "    name=f\"{prompt}\",\n",
    "    metadata={\n",
    "        \"prompt\": prompt,\n",
    "        \"negative_prompt\": negative_prompt,\n",
    "        \"seed\": seed,\n",
    "        \"num_inference_steps\": num_inference_steps,\n",
    "        \"guidance_scale\": 10\n",
    "    }\n",
    ")\n",
    "\n",
    "exp.end()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Retrieve the experiment results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "height": 199
   },
   "outputs": [],
   "source": [
    "reference_experiment = comet_ml.APIExperiment(\n",
    "    workspace=\"ckaiser\",\n",
    "    project_name=\"4-diffusion-prompting\",\n",
    "    previous_experiment=\"f05b04ac203a4f9aa606ea6cf9417fa3\"\n",
    ")\n",
    "\n",
    "reference_image = reference_experiment.get_asset_by_name(f\"{prompt}\")\n",
    "\n",
    "plt.imshow(Image.open(io.BytesIO(reference_image)))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Additional Resources\n",
    "* For more on how to use [Comet](https://www.comet.com/site/?utm_source=dlai&utm_medium=course&utm_campaign=prompt_engineering_for_vision_models&utm_content=dlai_L4) for experiment tracking, check out this [Quickstart Guide](https://colab.research.google.com/drive/1jj9BgsFApkqnpPMLCHSDH-5MoL_bjvYq?usp=sharing) and the [Comet Docs](https://www.comet.com/docs/v2/?utm_source=dlai&utm_medium=course&utm_campaign=prompt_engineering_for_vision_models&utm_content=dlai_L4).\n",
    "* This course is based on two blog articles from Comet. Explore them for more on how to use newer versions of Stable Diffusion in this pipeline, additional tricks to improve your inpainting results, and a breakdown of the pipeline architecture:\n",
    "  * [SAM + Stable Diffusion for Text-to-Image Inpainting](https://www.comet.com/site/blog/sam-stable-diffusion-for-text-to-image-inpainting/?utm_source=dlai&utm_medium=course&utm_campaign=prompt_engineering_for_vision_models&utm_content=dlai_L4)\n",
    "  * [Image Inpainting for SDXL 1.0 Base Model + Refiner](https://www.comet.com/site/blog/image-inpainting-for-sdxl-1-0-base-refiner/?utm_source=dlai&utm_medium=course&utm_campaign=prompt_engineering_for_vision_models&utm_content=dlai_L4)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
